In this computer lab, we will learn how to use automatic tools to forecast time series, including statistical models like exponential smoothing and ARIMA, but with an emphasis on machine learning tools like random forests and neural networks.
In particular, we will learn how to use the modeltime package
library(tidyverse) # great collection of packages to visualize and manage datasets
library(lubridate) # to work with dates
library(fpp3)
library(tsibble) # tidy data structure for time series
library(tidymodels) # machine learning ecosystem of packages (the new caret)
library(modeltime) # automatic forecasting using machine learning
library(modeltime.ensemble) # ensembles
library(timetk) # similar to tidyverse but for time series (visualization and management of time series)
Using data from AEMET, we will download daily maximum temperatures in Madrid from 1960 onwards, and then aggregate them into monthly maximum temperatures
In summary, Madrid's summers are hot, dry, and mostly clear, while its winters are cold and partly cloudy
Hence, the goal will be to forecast monthly maximum temperatures in Madrid
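As a warm-up, the daily-to-monthly aggregation we will perform later with dplyr can be sketched in base R. The data here are synthetic (dates and the temperature formula are made up for illustration), and `tapply()` stands in for the `group_by()`/`summarise()` pipeline used below:

```r
# Illustrative only: synthetic daily maxima, aggregated to monthly maxima
dates <- seq(as.Date("2023-01-01"), as.Date("2023-03-31"), by = "day")
set.seed(1)
tmax <- 15 + 10 * sin(2 * pi * as.numeric(format(dates, "%j")) / 365) +
  rnorm(length(dates))
ym <- format(dates, "%Y-%m")                  # year-month grouping key
monthly_max <- tapply(tmax, ym, max, na.rm = TRUE)
monthly_max                                   # one maximum per month
```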
# Library to connect to the Spanish Meteorological Agency (AEMET) using its API
library(climaemet)
aemet_stations()
## # A tibble: 291 × 7
## indicativo indsinop nombre provincia altitud longitud latitud
## <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl>
## 1 B013X 08304 ESCORCA, LLUC ILLES BA… 490 2.89 39.8
## 2 B228 08301 PALMA, PUERTO ILLES BA… 3 2.63 39.6
## 3 B248 08303 SIERRA DE ALFABIA, BU… ILLES BA… 1030 2.71 39.7
## 4 B278 08306 PALMA DE MALLORCA, AE… ILLES BA… 8 2.74 39.6
## 5 B346X 08310 PORRERES ILLES BA… 120 3.02 39.5
## 6 B434X 08309 PORTOCOLOM ILLES BA… 17 3.27 39.4
## 7 B569X 08311 CAPDEPERA ILLES BA… 57 3.48 39.7
## 8 B691Y 08312 SA POBLA ILLES BA… 40 3.02 39.7
## 9 B893 08314 MENORCA, AEROPUERTO ILLES BA… 91 4.22 39.9
## 10 B954 08373 IBIZA, AEROPUERTO ILLES BA… 6 1.38 38.9
## # ℹ 281 more rows
# select here a station
station <- "3195" # Madrid Retiro
temp_dat <-
aemet_daily_clim(station, start = "1960-01-01", end = "2024-02-29")
#save(temp_dat, file="temperatures.RData")
#load("temperatures.RData")
# Aggregate to month
temp_month <- temp_dat %>%
  mutate(year.month = yearmonth(fecha)) %>%
  group_by(year.month) %>%
  summarise(max = max(tmax, na.rm = TRUE)) %>%
  arrange(year.month)
# Convert into ts
gtemp_ts <- ts(temp_month$max, start = c(1960, 1), frequency = 12) %>%
  as_tsibble() %>%
  drop_na()
temp_month %>% mutate(month = as.factor(month(year.month, label = TRUE, abbr = TRUE))) %>%
group_by(month) %>% mutate(medTemp = mean(max, na.rm = T)) %>%
ggplot(aes(x=month, y=max, group=month, fill=medTemp, color=medTemp)) +
geom_boxplot()+
scale_fill_gradient(low="lightblue",high="orange")+
scale_color_gradient(low="blue",high="red")+
labs(title="Maximum temperatures in Madrid (Retiro) per month of the year",
subtitle="Data from 1960 to 2024", y = "Maximum temperature in ºC", x = "")+theme_minimal()+
theme(plot.background = element_rect(fill='#212121'), text=element_text(size=14,color='#FFFFFF'),
axis.text = element_text(color='#FFFFFF'), panel.grid.major.y = element_line(color = '#55565B', linetype = "dotted"),panel.grid.major.x = element_line(color = '#55565B', linetype = "dotted"),
panel.grid.minor.y = element_blank(),panel.grid.minor.x = element_blank(),
plot.title=element_text(size=20), legend.position="none")
The hottest months in Madrid were June 2019, August 2021, and July 2022, each reaching 40.7 ºC
The modeltime package also works with tidy data and builds its time-series models on top of tidymodels (the new caret)
Hence, it is worth acquiring some basic tidymodels skills first: https://www.tidymodels.org/start/
In a sense, modeltime combines the fable library (time series, no ML) with the tidymodels ecosystem (machine learning, no time series)
With modeltime, we can train many models at the same time (arima, prophet, random forests, xgboost, neural networks, etc.)
Developed by Matt Dancho: https://business-science.github.io/modeltime/
# we need to change the format of the index date
gtemp_ts$index = as.Date(gtemp_ts$index)
gtemp_ts %>%
plot_time_series(.date_var=index, .value=value, .interactive = TRUE,
                   .title = 'Maximum temperatures in Madrid center from 1960',
                   .y_lab = "Monthly data in °C")
The maximum temperature has increased by about 2 °C over these six decades
Time-series decomposition with modeltime:
gtemp_ts %>%
plot_stl_diagnostics(index, value,
.frequency = "auto", .trend = "auto",
.feature_set = c("observed", "season", "trend", "remainder"),
                     .interactive = TRUE)
We can easily split the time series into training and testing sets (using the timetk library)
The last 12 months of data form the testing set; the preceding 20 years form the training set
# forecasting horizon = 1 year
# training window = 20 years
splits <- gtemp_ts %>%
time_series_split(date_var = index, initial = "20 years", assess = "1 year")
# visualize the split
splits %>%
tk_time_series_cv_plan() %>%
  plot_time_series_cv_plan(index, value, .interactive = FALSE)
Let's try a 5-fold time-series cross-validation
With real time series, which are typically non-stationary, it is usually better to use a fixed-length training window than an expanding/sliding one
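Before calling time_series_cv(), it may help to see what a fixed-window rolling-origin scheme does to the observation indices. A base-R sketch whose window sizes mirror the call below (the index arithmetic is my own illustration, not timetk's implementation):

```r
# Illustrative index arithmetic for 5 fixed-window rolling-origin slices
# (initial = 15 years, assess = 1 year, skip = 12 months, cumulative = FALSE)
n_total <- 40 * 12   # pretend we have 40 years of monthly observations
initial <- 15 * 12   # fixed-length training window
assess  <- 12        # forecasting horizon (h = 12 months)
skip    <- 12        # move the forecast origin back one year per slice

slices <- lapply(seq_len(5) - 1, function(k) {
  test_end   <- n_total - k * skip
  test_start <- test_end - assess + 1
  list(train = (test_start - initial):(test_start - 1),
       test  = test_start:test_end)
})
# every training window has the same length: the window slides, it does not grow
sapply(slices, function(s) length(s$train))
```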
# forecasting horizon = 1 year
# training window = 15 years, always fixed (cumulative = FALSE)
# forecast every year
splits <- time_series_cv(data = gtemp_ts,
date_var = index,
initial = "15 years", # window for train set
assess = "1 year", # h = 12 months
skip = "12 months", # forecast every year
slice_limit = 5, # maximum number of blocks/slices
cumulative = FALSE)
# visualize the splits
splits %>%
  plot_time_series_cv_plan(index, value, .interactive = FALSE)
Note how the training set always contains 15 years, while the testing set contains 1 year
Now we train the models automatically
We will select the first split (forecast Mar 2023 to Feb 2024 using the previous 15 years), but we could select any other
We start with three classical models, no ML: auto.arima, ets, and prophet
sp = 1 # select here the split
# First the auto.arima of Hyndman
model_fit_arima =
arima_reg() %>%
# arima_reg(seasonal_period = 12, seasonal_differences = 1, non_seasonal_differences = 1) %>%
set_engine("auto_arima") %>%
fit(value ~ index, training(splits$splits[[sp]]))
# modeltime models require a date column to be a regressor
model_fit_arima
## parsnip model object
##
## Series: outcome
## ARIMA(0,0,2)(1,1,2)[12]
##
## Coefficients:
## ma1 ma2 sar1 sma1 sma2
## 0.0133 0.1375 0.1355 -1.1965 0.276
## s.e. 0.0803 0.0801 0.4427 0.4114 0.410
##
## sigma^2 = 4.181: log likelihood = -367.54
## AIC=747.08 AICc=747.6 BIC=765.82
# Error-Trend-Season (ETS) model
model_fit_ets =
exp_smoothing() %>%
set_engine("ets") %>%
fit(value ~ index, training(splits$splits[[sp]]))
model_fit_ets
## parsnip model object
##
## ETS(M,N,A)
##
## Call:
## forecast::ets(y = outcome, model = model_ets, damped = damping_ets,
##
## Call:
## alpha = alpha, beta = beta, gamma = gamma)
##
## Smoothing parameters:
## alpha = 1e-04
## gamma = 1e-04
##
## Initial states:
## l = 26.8153
## s = -8.4407 -11.1048 -11.5541 -7.3178 -0.0793 6.3929
## 11.4142 10.9763 9.9194 4.4873 -0.6481 -4.0454
##
## sigma: 0.0788
##
## AIC AICc BIC
## 1201.547 1204.474 1249.441
# Now the prophet by facebook
model_fit_prophet <- prophet_reg() %>%
set_engine("prophet", monthly.seasonality = TRUE) %>%
fit(value ~ index, training(splits$splits[[sp]]))
model_fit_prophet
## parsnip model object
##
## PROPHET Model
## - growth: 'linear'
## - n.changepoints: 25
## - changepoint.range: 0.8
## - yearly.seasonality: 'auto'
## - weekly.seasonality: 'auto'
## - daily.seasonality: 'auto'
## - seasonality.mode: 'additive'
## - changepoint.prior.scale: 0.05
## - seasonality.prior.scale: 10
## - holidays.prior.scale: 10
## - logistic_cap: NULL
## - logistic_floor: NULL
## - extra_regressors: 0
ML tools are more difficult to apply to time series because we first need to create the relevant features
We are going to create the features using recipe objects (with steps)
See more details in https://www.tidymodels.org/
The recipe:
recipe_spec <- recipe(value ~ index, training(splits$splits[[sp]])) %>%
step_timeseries_signature(index) %>%
step_rm(contains("am.pm"), contains("hour"), contains("minute"),contains("day"),
contains("week"), contains("second"), contains("xts"), contains("iso"), index_month, index_quarter, index_half, index_index.num) %>%
step_dummy(all_nominal()) %>%
step_fourier(index, K = 2, period = 12)
# step_rm is to remove features we are not going to use
# Just to see the data frame with the recipe
recipe_spec %>% prep() %>% juice() %>% View()
By default, many features are created automatically. Unnecessary features can be removed with recipes::step_rm()
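To see what step_fourier(index, K = 2, period = 12) contributes, here is a base-R sketch of the same four Fourier columns: a sin/cos pair at each harmonic k = 1, 2 of an annual cycle (the month counter t_idx is illustrative):

```r
# Illustrative: the four Fourier feature columns for K = 2, period = 12
t_idx  <- 1:24        # hypothetical month counter over two years
period <- 12
X <- cbind(sin1 = sin(2 * pi * 1 * t_idx / period),
           cos1 = cos(2 * pi * 1 * t_idx / period),
           sin2 = sin(2 * pi * 2 * t_idx / period),
           cos2 = cos(2 * pi * 2 * t_idx / period))
# the features repeat every 12 months, encoding annual seasonality smoothly
```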
We can fit any model using different computational engines
See more details in https://www.tidymodels.org/
Let's try glmnet, random forest, XGBoost, neural networks, and prophet
For each model, we need to define a workflow: a container that bundles everything required to fit and predict (model + features + training data)
# glmnet
model_spec_glmnet <- linear_reg(penalty = 0.01, mixture = 0.5) %>%
set_engine("glmnet")
# Hyper-parameters can be optimized in this way: https://business-science.github.io/modeltime/articles/parallel-processing.html
workflow_fit_glmnet <- workflow() %>%
add_model(model_spec_glmnet) %>%
add_recipe(recipe_spec %>% step_rm(index)) %>%
fit(training(splits$splits[[sp]]))
# randomForest
model_spec_rf <- rand_forest(trees = 500, mode = "regression") %>%
set_engine("randomForest")
workflow_fit_rf <- workflow() %>%
add_model(model_spec_rf) %>%
add_recipe(recipe_spec %>% step_rm(index)) %>%
fit(training(splits$splits[[sp]]))
# XGBoost
model_spec_xgboost <- boost_tree(mode = "regression") %>%
set_engine("xgboost")
wflw_fit_xgboost <- workflow() %>%
add_model(model_spec_xgboost) %>%
add_recipe(recipe_spec %>% step_rm(index)) %>%
fit(training(splits$splits[[sp]]))
# NNETAR
model_spec_nnetar <- nnetar_reg(seasonal_period = 12, mode = "regression") %>%
set_engine("nnetar")
wflw_fit_nnetar <- workflow() %>%
add_model(model_spec_nnetar) %>%
add_recipe(recipe_spec) %>%
fit(training(splits$splits[[sp]]))
# Prophet with all the features
model_spec_prophet <- prophet_reg(
seasonality_yearly = TRUE
) %>%
set_engine("prophet")
wflw_fit_prophet <- workflow() %>%
add_model(model_spec_prophet) %>%
add_recipe(recipe_spec) %>%
fit(training(splits$splits[[sp]]))
The modeltime table organizes the models with IDs and creates generic descriptions to help us keep track of them
model_table <- modeltime_table(
model_fit_arima,
model_fit_ets,
model_fit_prophet,
workflow_fit_glmnet,
workflow_fit_rf,
wflw_fit_xgboost,
wflw_fit_nnetar,
wflw_fit_prophet
)
model_table
## # Modeltime Table
## # A tibble: 8 × 3
## .model_id .model .model_desc
## <int> <list> <chr>
## 1 1 <fit[+]> ARIMA(0,0,2)(1,1,2)[12]
## 2 2 <fit[+]> ETS(M,N,A)
## 3 3 <fit[+]> PROPHET
## 4 4 <workflow> GLMNET
## 5 5 <workflow> RANDOMFOREST
## 6 6 <workflow> XGBOOST
## 7 7 <workflow> NNAR(1,1,10)[12]
## 8 8 <workflow> PROPHET W/ REGRESSORS
First, model calibration is used to quantify error and estimate confidence intervals
Model calibration will be performed on the out-of-sample data (testing sets) with the modeltime_calibrate() function
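Conceptually, calibration stores the out-of-sample residuals, from which approximate prediction intervals can be derived. A base-R sketch with made-up numbers (the normal-theory 1.96·sigma interval here is a simplification, not modeltime's exact method):

```r
# Illustrative: out-of-sample residuals and a naive normal-theory 95% interval
actual   <- c(12.1, 14.5, 19.8, 24.3, 29.0, 33.2, 36.1, 35.4, 30.2, 23.9, 17.0, 13.2)
forecast <- c(13.0, 15.0, 19.0, 25.0, 28.5, 32.0, 37.0, 34.5, 29.5, 24.5, 16.5, 14.0)
resid    <- actual - forecast      # what calibration stores, per test observation
sigma    <- sd(resid)
lower_95 <- forecast - 1.96 * sigma
upper_95 <- forecast + 1.96 * sigma
```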
calibration_table <- model_table %>%
modeltime_calibrate(new_data = testing(splits$splits[[sp]]))
calibration_table
## # Modeltime Table
## # A tibble: 8 × 5
## .model_id .model .model_desc .type .calibration_data
## <int> <list> <chr> <chr> <list>
## 1 1 <fit[+]> ARIMA(0,0,2)(1,1,2)[12] Test <tibble [12 × 4]>
## 2 2 <fit[+]> ETS(M,N,A) Test <tibble [12 × 4]>
## 3 3 <fit[+]> PROPHET Test <tibble [12 × 4]>
## 4 4 <workflow> GLMNET Test <tibble [12 × 4]>
## 5 5 <workflow> RANDOMFOREST Test <tibble [12 × 4]>
## 6 6 <workflow> XGBOOST Test <tibble [12 × 4]>
## 7 7 <workflow> NNAR(1,1,10)[12] Test <tibble [12 × 4]>
## 8 8 <workflow> PROPHET W/ REGRESSORS Test <tibble [12 × 4]>
Now, with calibrated data, we can forecast all the models
calibration_table %>%
modeltime_forecast(actual_data = filter(gtemp_ts,year(index)>=2018)) %>%
  plot_modeltime_forecast(.interactive = FALSE) + labs(title = "Forecasts", y = "", caption = "modeltime")
Accuracy table on the first testing set:
| .model_id | .model_desc | .type | mae | mape | mase | smape | rmse | rsq |
|---|---|---|---|---|---|---|---|---|
| 1 | ARIMA(0,0,2)(1,1,2)[12] | Test | 1.98 | 8.04 | 0.43 | 8.19 | 2.34 | 0.93 |
| 2 | ETS(M,N,A) | Test | 1.99 | 8.03 | 0.43 | 8.22 | 2.35 | 0.93 |
| 3 | PROPHET | Test | 2.17 | 9.06 | 0.47 | 9.12 | 2.46 | 0.92 |
| 4 | GLMNET | Test | 1.93 | 7.96 | 0.42 | 8.03 | 2.37 | 0.92 |
| 5 | RANDOMFOREST | Test | 2.37 | 10.05 | 0.51 | 10.10 | 2.89 | 0.89 |
| 6 | XGBOOST | Test | 2.72 | 10.78 | 0.59 | 11.37 | 3.38 | 0.87 |
| 7 | NNAR(1,1,10)[12] | Test | 3.16 | 12.90 | 0.68 | 13.88 | 4.16 | 0.81 |
| 8 | PROPHET W/ REGRESSORS | Test | 2.09 | 8.67 | 0.45 | 8.83 | 2.48 | 0.92 |
Insights?
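The metrics in the table above are easy to verify by hand. A base-R sketch on a toy 12-month test set (the numbers are illustrative, not from the lab's data):

```r
# Illustrative: computing the accuracy metrics by hand on a toy test set
actual   <- c(14, 13, 18, 22, 27, 33, 36, 35, 30, 24, 17, 13)
forecast <- c(15, 14, 17, 23, 26, 32, 38, 34, 29, 25, 16, 14)
e     <- actual - forecast
mae   <- mean(abs(e))                                            # mean absolute error
rmse  <- sqrt(mean(e^2))                                         # root mean squared error
mape  <- mean(abs(e / actual)) * 100                             # mean absolute % error
smape <- mean(2 * abs(e) / (abs(actual) + abs(forecast))) * 100  # symmetric MAPE
```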
We should now rerun the previous workflow on the other splits
To do that, we resample the accuracy of each model (fitted above on the first split) over the other slices (testing sets)
Resample forecasts: for each slice, the model is re-fitted on that slice's training set and forecasts are produced for its testing set. That is, we take each model's specification from the first split (just the model type) and refit (re-train) it repeatedly on the resampled data (the rest of the splits)
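The refit-per-slice idea can be mimicked in base R with a simple seasonal-mean "model" on synthetic monthly data (everything here is illustrative; modeltime_fit_resamples() does the same loop with the real models and rsample objects):

```r
# Illustrative: refit a seasonal-mean "model" on each of 5 fixed 15-year windows
set.seed(42)
n     <- 25 * 12
y     <- 20 + 10 * sin(2 * pi * (1:n) / 12) + rnorm(n, sd = 2)  # synthetic series
month <- ((1:n) - 1) %% 12 + 1

mae_per_slice <- sapply(0:4, function(k) {
  test_idx  <- (n - k * 12 - 11):(n - k * 12)
  train_idx <- (min(test_idx) - 15 * 12):(min(test_idx) - 1)
  fit   <- tapply(y[train_idx], month[train_idx], mean)  # refit on this slice
  preds <- fit[as.character(month[test_idx])]            # predict its test year
  mean(abs(y[test_idx] - preds))
})
mae_per_slice   # one out-of-sample MAE per slice
```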
resample_results <- model_table %>%
modeltime_fit_resamples(
resamples = splits,
control = control_resamples(verbose = TRUE)
  )
## ── Fitting Resamples ────────────────────────────────────────────
##
## 26.513 sec elapsed
## # Modeltime Table
## # A tibble: 8 × 4
## .model_id .model .model_desc .resample_results
## <int> <list> <chr> <list>
## 1 1 <fit[+]> ARIMA(0,0,2)(1,1,2)[12] <rsmp[+]>
## 2 2 <fit[+]> ETS(M,N,A) <rsmp[+]>
## 3 3 <fit[+]> PROPHET <rsmp[+]>
## 4 4 <workflow> GLMNET <rsmp[+]>
## 5 5 <workflow> RANDOMFOREST <rsmp[+]>
## 6 6 <workflow> XGBOOST <rsmp[+]>
## 7 7 <workflow> NNAR(1,1,10)[12] <rsmp[+]>
## 8 8 <workflow> PROPHET W/ REGRESSORS <rsmp[+]>
Visualize the accuracy for each model and each slice
resample_results %>%
plot_modeltime_resamples(
.summary_fn = mean,
.point_size = 3,
.interactive = FALSE
  )
The average performance:
resample_results %>%
modeltime_resample_accuracy(summary_fns = list(mean = mean)) %>%
  table_modeltime_accuracy(.interactive = FALSE)
| .model_id | .model_desc | .type | n | mae_mean | mape_mean | mase_mean | smape_mean | rmse_mean | rsq_mean |
|---|---|---|---|---|---|---|---|---|---|
| 1 | ARIMA(0,0,2)(1,1,2)[12] | Resamples | 5 | 1.68 | 6.68 | 0.38 | 6.57 | 2.13 | 0.95 |
| 2 | ETS(M,N,A) | Resamples | 5 | 1.55 | 6.14 | 0.35 | 6.10 | 1.98 | 0.95 |
| 3 | PROPHET | Resamples | 5 | 1.57 | 6.18 | 0.36 | 6.06 | 2.01 | 0.95 |
| 4 | GLMNET | Resamples | 5 | 1.56 | 6.26 | 0.35 | 6.13 | 2.00 | 0.95 |
| 5 | RANDOMFOREST | Resamples | 5 | 1.90 | 7.79 | 0.43 | 7.64 | 2.31 | 0.94 |
| 6 | XGBOOST | Resamples | 5 | 2.05 | 8.14 | 0.46 | 8.23 | 2.49 | 0.93 |
| 7 | NNAR(1,1,10)[12] | Resamples | 5 | 2.52 | 9.95 | 0.57 | 10.06 | 3.29 | 0.88 |
| 8 | PROPHET W/ REGRESSORS | Resamples | 5 | 1.60 | 6.29 | 0.36 | 6.21 | 2.02 | 0.95 |
Which tools are the most robust across the different splits?
It seems ARIMA works well but needs to be re-trained each time; on the other hand, ML tools, especially prophet and random forests, do not need to be re-trained as often
Finally, let's combine the best models using a simple-average ensemble to forecast 2024
Besides the testing data (the last 12 available months), we will forecast 12 additional, completely unknown months; i.e. our final forecasting horizon will be \(h=24\)
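Conceptually, a mean ensemble is just the row-wise average of the member forecasts. A toy base-R sketch (three made-up model forecasts over six steps):

```r
# Illustrative: three made-up member forecasts, averaged row-wise
fc <- cbind(arima   = c(14, 16, 21, 25, 30, 34),
            ets     = c(13, 17, 20, 26, 29, 35),
            prophet = c(15, 15, 22, 24, 31, 33))
ensemble <- rowMeans(fc)   # what ensemble_average(type = "mean") computes
ensemble
```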
future.data <- testing(splits$splits[[sp]]) %>%
  future_frame(index, .length_out = "1 year", .bind_data = TRUE)
# calibrate again to obtain longer forecasts
calibration_table <- model_table %>% modeltime_calibrate(future.data)
# Select here your favourite models
iaux = c(1, 2, 3, 5)
# make the ensemble
ensemble_fit_avg <- calibration_table[iaux,] %>%
ensemble_average(type = "mean")
ensemble_fit_avg
## ── Modeltime Ensemble ───────────────────────────────────────────
## Ensemble of 4 Models (MEAN)
##
## # Modeltime Table
## # A tibble: 4 × 5
## .model_id .model .model_desc .type .calibration_data
## <int> <list> <chr> <chr> <list>
## 1 1 <fit[+]> ARIMA(0,0,2)(1,1,2)[12] Test <tibble [24 × 4]>
## 2 2 <fit[+]> ETS(M,N,A) Test <tibble [24 × 4]>
## 3 3 <fit[+]> PROPHET Test <tibble [24 × 4]>
## 4 5 <workflow> RANDOMFOREST Test <tibble [24 × 4]>
# calibration and performance
ensemble_fit_avg %>%
modeltime_calibrate(testing(splits$splits[[sp]])) %>%
modeltime_accuracy() %>%
  table_modeltime_accuracy(.interactive = FALSE)
| .model_id | .model_desc | .type | mae | mape | mase | smape | rmse | rsq |
|---|---|---|---|---|---|---|---|---|
| 1 | ENSEMBLE (MEAN): 4 MODELS | Test | 2.08 | 8.61 | 0.45 | 8.73 | 2.47 | 0.92 |
ensemble_fit_avg %>% modeltime_table() %>%
modeltime_forecast(actual_data = filter(gtemp_ts,year(index)>=2018)) %>%
plot_modeltime_forecast(.interactive = FALSE) + labs(title = "24-months ahead forecasts", y = "", caption = "modeltime" ) +
scale_x_date(breaks = scales::date_breaks("1 year"),
               labels = scales::date_format("%Y"))
The performance seems good in 2023, with a similar pattern forecast for 2024
During the summer of 2023, maximum temperatures were higher than expected, and at the very end of the period they were lower than expected